Similar resources
Improving speaker identification in TV-shows using person name detection in overlaid text and speech
This paper is dedicated to the use of auxiliary information to help a classical acoustic-based speaker identification system in the specific context of TV shows. The underlying assumption is that auxiliary information could help (1) to rerank n-best speaker hypotheses provided by the acoustic-only speaker identification system, (2) to provide a confidence score to refine a rejectio...
Analysis of i-vector framework for speaker identification in TV-shows
Inspired by Joint Factor Analysis, i-vector-based analysis has become the most popular, state-of-the-art framework for the speaker verification task. Many studies, mainly within the NIST/SRE evaluation campaigns, have sought to further improve the performance of speaker verification systems. Nevertheless, while the i-vector framework has been used in other speech pr...
MultiBIC: an improved speaker segmentation technique for TV shows
Speaker segmentation systems usually have problems detecting short segments, which raises the number of deletions and therefore harms the performance of the system. This is a complication when it comes to segmenting multimedia content such as movies and TV shows, where dialogs among characters are very common. In this paper a modification of the BIC algorithm is presented, whic...
Collaborative annotation for person identification in TV shows
This paper presents a collaborative annotation framework for person identification in TV shows. The web annotation frontend will be demonstrated during the Show and Tell session. All the code for annotation is made available on github. The tool can also be used in a crowd-sourcing environment.
Structured prediction for speaker identification in TV series
Though radio and TV broadcast are highly structured documents, state-of-the-art speaker identification algorithms do not take advantage of this information to improve prediction performance: speech turns are usually identified independently from each other, using unstructured multi-class classification approaches. In this work, we propose to address speaker identification as a sequence labeling...
Journal
Journal title: Multimedia Tools and Applications
Year: 2014
ISSN: 1380-7501,1573-7721
DOI: 10.1007/s11042-014-1940-3